Introduction¶
Welcome to HoloViews!
This tutorial explains the basics of how to use HoloViews to explore your data. If this is your first contact with HoloViews, you may want to start by looking at our showcase to get a quick idea of what can be achieved with HoloViews. If this introduction does not cover the type of visualizations you need, you should check out our Elements and Containers overviews to see what else is available.
What is HoloViews?¶
HoloViews allows you to collect and annotate your data in a way that reveals it naturally, with a minimum of effort needed for you to see your data as it actually is. HoloViews is not a plotting library -- it connects your data to plotting code implemented in other packages, such as matplotlib. HoloViews is also not primarily a mass storage or archival data format like HDF5 -- it is mainly designed to package your data to make it maximally visualizable and viewable interactively.
If you supply just enough additional information to the data of interest, HoloViews allows you to store, index, slice, analyze, reduce, compose, display, and animate your data as naturally as possible. HoloViews makes your numerical data come alive, revealing itself easily and without extensive coding.
Here are a few of the things HoloViews allows you to associate with your data:
- The Element type. This encapsulates your data and is the most fundamental indicator of how your data can be analyzed and displayed. For instance, if you wrap a 2D Numpy array in an Image it will be displayed as an image with a colormap by default, a Curve will be presented as a line plot on an axis, and so on. Once your data has been encapsulated in an Element object, other Elements can easily be created from it, such as obtaining a Curve by taking a slice of an Image.
- Dimensions of your data. The key_dimensions describe how your data can be indexed. The value_dimensions describe what the resulting indexed data represents. A numerical Dimension can have a name, type, range, and unit. This information allows HoloViews to rescale and label axes and allows HoloViews to be smart in how it processes your data.
- The multi-dimensional space in which your data resides. This may be space as we normally think of it (in x, y, and z coordinates). It may be the spatial position of one component relative to another. Or it may be an entirely abstract space, such as a parameter space or a list of experiments done on different days. Whatever multi-dimensional space characterizes how one chunk of your data relates to another chunk, you can embed your data in that space easily, sparsely populating whatever region of that space you want to analyze.
- How your data should be grouped for display. In short, how you want your data to be organized for visualization. If you have a collection of points that was computed from an image, you can easily overlay your points over the image. As a result you have something that both displays sensibly, and is grouped together in a semantically meaningful way.
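To make the idea of dimension metadata more concrete, here is a minimal plain-Python sketch of how a name, unit, and range might be bundled together and then used to label and rescale an axis. DimensionSketch, axis_label, and normalize are hypothetical names invented for this illustration; they are not the HoloViews Dimension class or its API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DimensionSketch:
    """Hypothetical stand-in for dimension metadata (not the HoloViews class)."""
    name: str
    unit: Optional[str] = None
    value_range: Optional[Tuple[float, float]] = None


def axis_label(dim):
    """Build an axis label from the name, appending the unit when one is set."""
    return "%s (%s)" % (dim.name, dim.unit) if dim.unit else dim.name


def normalize(value, dim):
    """Rescale a value to the 0-1 interval using the declared range."""
    low, high = dim.value_range
    return (value - low) / (high - low)


frequency = DimensionSketch("Frequency", unit="Hz", value_range=(0.0, 200.0))
print(axis_label(frequency))        # Frequency (Hz)
print(normalize(50.0, frequency))   # 0.25
```

The point of the sketch is only that a small amount of declared metadata is enough for generic code to produce sensible labels and scaling without any plot-specific instructions.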
HoloViews can display your data even if it knows only the Element type, which lets HoloViews stay out of your way when initially exploring your data, offering immediate feedback with reasonable default visualizations. As your analysis becomes more complex and your research progresses, you may offer more of the useful metadata above so that HoloViews will automatically improve your displayed figures accordingly. Throughout, all you need to supply is this metadata plus optional plotting hints (such as choosing specific colors if you like), rather than having to write cumbersome code to put figures together or having to paste bits together in an external drawing or plotting program.
Note that the HoloViews component types each have only minimal required dependencies (Numpy and Param, both with no required dependencies of their own). This data format can thus be integrated directly into your research or development code if you wish (see e.g. the ImaGen library for an example). The plotting is currently implemented using matplotlib, but the components do not in any way depend on matplotlib directly, so that other packages could be used for the same data in the future if needed. Similarly, HoloViews provides strong support for the IPython notebook interface, and we recommend using the notebook for building reproducible yet interactive workflows, but none of the components require IPython either. Thus HoloViews is designed to fit into your existing workflow, without adding complicated dependencies.
Getting Started¶
To enable IPython integration, you need to load the IPython extension as follows:
import holoviews
%load_ext holoviews.ipython
As HoloViews makes extensive use of Numpy to hold raw data, a qualified numpy import is recommended:
import numpy as np
Interactive Documentation¶
HoloViews has extremely well-documented and error-checked constructors for every class (provided by the Param library). We have made sure to provide a number of convenient ways to access this information interactively. E.g. if you have imported Element:
from holoviews import Element
You can now access class and parameter documentation in the following ways:
- IPython's help syntax: Type Element? (or mostly equivalently, help(Element)) and then press Shift+Enter
- Repeatedly press <Shift+TAB> to get more information after opening the constructor: Element(<Shift+TAB>
- Type the %params magic to view information in the pager for an object in the namespace: %params Element (and then press Shift+Enter)
Lastly, you can tab-complete arguments to HoloViews classes, so if you try Element(va<TAB>, you will see the available keyword arguments (value_dimensions in this case).
A simple visualization¶
To begin, let's see how HoloViews stays out of your way when initially exploring some data. Let's view an image, selecting the appropriate RGB Element to do so:
from holoviews import RGB
Now, although we could immediately load our image into the RGB object, we will first load it into a raw Numpy array:
parrot = RGB.load_image('../assets/macaw.png', array=True)
print("%s with shape %s" % (type(parrot), parrot.shape))
As we can see this 400×400 image data array has four channels (the fourth being an unused alpha channel). Now let us make an RGB element to wrap up this Numpy array with its associated label:
rgb_parrot = RGB(parrot, label='Macaw')
rgb_parrot
Here rgb_parrot is an RGB HoloViews element, which requires data with three or four channels and can store an associated label. rgb_parrot is not a plot -- it is just a data structure with some metadata. The holoviews.ipython extension, in turn, makes sure that any RGB element is displayed appropriately, i.e. as a color image with an associated optional title, plotted using matplotlib. But the RGB object itself does not have any connection to the plotting library, and stores no data about the plot, just its own data, which is sufficient for the external plotting routines to visualize the data usefully and meaningfully.
Because rgb_parrot is just our actual data, it can be composed with other objects, pickled, and analyzed as-is. For instance, we can still access the underlying Numpy array easily via the .data attribute, and can verify that it is indeed our actual data:
rgb_parrot.data is parrot
Note that this is generally true throughout HoloViews; if you pass a HoloViews element a Numpy array of the right shape, the .data attribute will simply be a reference to the data you supplied. If you use an alternative data format when constructing an element, such as a Python list, a Numpy array of the appropriate type will be created and made available through the .data attribute. You can always use the identity check demonstrated above if you want to make absolutely sure your raw data is being used directly.
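The reference-versus-copy behaviour described above can be reproduced with plain Numpy. The sketch below uses a hypothetical ElementSketch class (not a HoloViews class) and assumes the wrapper stores its input via np.asarray, which is one simple way to get this behaviour; the source does not state how HoloViews does it internally.

```python
import numpy as np


class ElementSketch:
    """Hypothetical wrapper (not a HoloViews class) that stores array data."""

    def __init__(self, data, label=""):
        # np.asarray returns its input unchanged when it is already an ndarray
        # and builds a new array for other inputs such as lists.
        self.data = np.asarray(data)
        self.label = label


pixels = np.zeros((4, 4, 3))
from_array = ElementSketch(pixels, label="Array input")
print(from_array.data is pixels)    # True: the same object, no copy

rows = [[0.0, 1.0], [1.0, 0.0]]
from_list = ElementSketch(rows, label="List input")
print(from_list.data is rows)       # False: a new array was created
print(type(from_list.data).__name__, from_list.data.shape)  # ndarray (2, 2)
```

The identity test with is, rather than an equality test, is what distinguishes "the same array" from "an equal copy".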
As you compose these objects together, you will see that a complex visualization is not simply a visual display, but a rich data structure containing all the raw data or analyzed data ready for further manipulation and analysis.
Viewing individual color channels¶
For many analysis purposes, working in RGB colour space is rather limiting: it is often more flexible to work with a single N×M array at a time and visualize the data in each channel using a colormap. To do this we need the Image Element instead of the RGB Element.
To illustrate, let's start by visualizing the total luminance across all the channels of the parrot image, choosing a specific colormap using the HoloViews %%opts IPython cell magic. %%opts Image allows us to pass plotting hints to the underlying visualization code for Image objects:
%%opts Image style(cmap='coolwarm')
from holoviews import Image
luminance = Image(parrot.sum(axis=2), label='Summed Luminance')
luminance
This result is what we would expect: dark areas are shown in blue and bright areas are shown in red. Notice how the plotting hints (your desired colormap in this case) are kept separate from your actual data, so that the Image data structure contains only your actual data and the metadata that describes it, not incidental information like matplotlib options. We will now set the default colormap to grayscale for all subsequent cells using the %opts line magic, and we will come back to explain the %%opts and %opts magics in more detail later in the tutorial.
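The array manipulation behind the summed-luminance image is ordinary Numpy. As a small sketch, the example below uses a synthetic 2×2 RGBA array in place of the parrot image (an assumption made so that no image file is needed) to show how summing over the last axis, or slicing one channel, yields an N×M array suitable for an Image.

```python
import numpy as np

# Synthetic 2x2 RGBA image standing in for the parrot array (an assumption:
# the tutorial's image file is not needed for this sketch).
image = np.array([[[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]],
                  [[0.5, 0.5, 0.5, 1.0], [1.0, 1.0, 1.0, 1.0]]])

summed = image.sum(axis=2)   # collapse the channel axis: one value per pixel
red = image[:, :, 0]         # a single channel is also an N x M array

print(image.shape, summed.shape, red.shape)  # (2, 2, 4) (2, 2) (2, 2)
print(summed[0, 0], summed[1, 1])            # 1.0 4.0
```

Note that summing over all four channels includes the alpha channel, so a constant alpha simply adds the same offset to every pixel.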
We will now look at a single color channel by building an appropriate Image element:
%opts Image style(cmap='gray')
red = Image(parrot[:,:,0], label='Red')
red
Here we created the red Image directly from the Numpy array parrot. You can also make a lower-dimensional HoloViews component by slicing a higher-dimensional one. For instance, we will now combine this manually constructed red channel with green and blue channels constructed by slicing the rgb_parrot RGB HoloViews object to get the appropriate Image objects:
channels = red + rgb_parrot[:,:,'G'].relabel('Green') + rgb_parrot[:,:,'B'].relabel('Blue')
channels
Here we have combined these three HoloViews objects using the compositional operator + to create a new object named channels. When channels is displayed by the IPython notebook, each Image is shown side by side, with appropriate labels. In this format, you can see that the parrot looks distinctly different in the red channel than in the green and blue channels.
Note that the channels object isn't merely a display of three elements side by side, but a new composite object of type Layout containing our three Image objects:
print(repr(channels))
This object offers very convenient (tab-completable) attribute access to the components using the semantically meaningful labels we have assigned:
channels.Image.Blue
This allows us to recompose our data, here to compare the Red and Blue channels together more directly:
channels.Image.Red + channels.Image.Blue
Notice how the labels we have set are useful for both the titles and for the indexing, and are thus not simply plotting-specific details -- they are semantically meaningful metadata describing this data.
Grouping into Layouts¶
You may wonder what the ".Image" is doing in the middle of the indexing above. This is the group name which, even though we haven't set it directly, is as important a concept as the label. The group is a string description of the category or the semantic type of the data. That is, the group says what kind of thing this data is, and the label is your name for this particular piece of data.
By default, the group is the same as the name of the HoloViews element type, in this case Image:
channels.Image.Blue.group
The group is an extremely useful mechanism that allows you to structure your data in meaningful ways. As we noted above, the red channel is the most clearly different from the other two, and we can make it special if we wish by assigning it its own group:
channels = (Image(parrot[:,:,0], group='RedChannel', label='Macaw')
            + Image(parrot[:,:,1], group='Channel', label='Green')
            + Image(parrot[:,:,2], group='Channel', label='Blue'))
The red channel is given its own special group 'RedChannel' while the other two channels are grouped under the generic Channel. Here are the two channels under Channel, now easily accessible as a group:
channels.Channel
And now we can access the interesting red channel:
channels.RedChannel
Of course, you could also access the other two channels individually using channels.Channel.Green and channels.Channel.Blue respectively. We choose how much indexing we need to get at our elements and what groups are meaningful for us to manipulate together.
Now let's look at the whole set together:
channels
Although visually very similar to the first channels layout displayed above, we now have a nested data structure (a tree data structure called a Layout) with a different organization, as described in the next section.
The Layout datastructure¶
Here is the Layout object data structure we built using the + operator:
print(repr(channels))
All the raw data we have used is stored inside the Image objects that are easily accessible in this tree. We index the tree by group and label, e.g. RedChannel.Macaw or Channel.Green.
The elements themselves are identified according to the scheme {type}.{group}.{label}. Note that the indexing of the tree follows the group and label of the contained elements. This is the default behaviour and almost always true unless you explicitly set otherwise.
You may wonder what the string in parentheses (z) means. This is the name of the default value dimension for our Image objects. Again, this can be set to whatever name is appropriate to describe the elements of the two-dimensional array.
channels.RedChannel.Macaw.value_dimensions[0]
A Layout is an incredibly convenient and versatile way of collecting data together semantically in a way that conveys how to display it (i.e., as separate visualizations alongside each other). As we have seen, it allows data to be grouped by group as well as indexed to select individual elements of the tree. Note that if you want to see the textual representation of this structure rather than plots, e.g. for debugging, you can use the IPython %pprint magic to toggle between repr and rich display, or you can print repr(object).
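For readers curious how group-then-label attribute access can work, the following is a minimal plain-Python sketch of a two-level tree. TreeSketch is a hypothetical class written for this illustration only; it is not how Layout is implemented, and the strings stand in for the Image objects.

```python
class TreeSketch:
    """Hypothetical two-level tree indexed by group and then label."""

    def __init__(self):
        self._children = {}

    def add(self, group, label, item):
        # Create the group branch on first use, then file the item under it.
        branch = self._children.setdefault(group, TreeSketch())
        branch._children[label] = item

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_children"][name]
        except KeyError:
            raise AttributeError(name)

    def keys(self):
        return sorted(self._children)


tree = TreeSketch()
tree.add("RedChannel", "Macaw", "red data")
tree.add("Channel", "Green", "green data")
tree.add("Channel", "Blue", "blue data")

print(tree.Channel.Green)      # green data
print(tree.Channel.keys())     # ['Blue', 'Green']
print(tree.RedChannel.Macaw)   # red data
```

Accessing tree.Channel returns the whole branch, which mirrors how channels.Channel above gives both elements of that group at once.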
Grouping into Overlays¶
Putting two components (Elements or Containers) side by side into a Layout using + is one of the most common operations in HoloViews, and works with any possible component type. But there is another compositional operator * that is also very useful for creating complex visualizations, by overlaying components on top of each other. Nearly all components can be overlaid as well, except for a Layout; a Layout always contains Overlays and never the other way around.
Pointing to our parrot¶
One type of element designed specifically for overlaying is the annotation. Here we use the Arrow Element to label our parrot using the original RGB object with the overlay (*) operator:
from holoviews import Arrow
extents = (-0.5, -0.5, 0.5, 0.5) # Image spatial extents
o = rgb_parrot * Arrow(-0.1,0.2, 'Polly', '>', extents=extents)
o
An overlay is a compositional datastructure, just like Layout (it is in fact a subclass!). This means the same indexing and grouping semantics apply. To illustrate, we can index our overlay to pull it apart and lay the two components side by side:
o + o.RGB.Macaw + o.Arrow.I
Note that when there is no label available for an object in a Layout or Overlay, HoloViews will generate an appropriate Roman numeral identifier for indexing. In this case we index our arrow using Arrow.I. Naturally, overlays may themselves be elements of a Layout, as at left above.
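As an illustration of the Roman-numeral fallback, here is a small plain-Python sketch that converts integers to Roman numerals and assigns them to unlabeled items. The identifiers helper and its numbering rule (one running counter across all unlabeled items) are assumptions made for this example; the exact scheme HoloViews uses may differ.

```python
def to_roman(number):
    """Convert a positive integer to a Roman numeral string."""
    symbols = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
               (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    result = ""
    for value, symbol in symbols:
        while number >= value:
            result += symbol
            number -= value
    return result


def identifiers(labels):
    """Keep non-empty labels and number the unlabeled items I, II, III, ..."""
    count = 0
    names = []
    for label in labels:
        if label:
            names.append(label)
        else:
            count += 1
            names.append(to_roman(count))
    return names


print(identifiers(["Macaw", "", ""]))   # ['Macaw', 'I', 'II']
```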
Overlaying contours¶
Overlays may be simple annotations as demonstrated above, but often they can contain significant volumes of important data. To demonstrate, we will introduce the concept of operations and the Contours Element:
from holoviews.operation import contours
This operation takes an Image as input and generates an overlay for us, where our original input is returned with contour lines overlaid on top. Let's have a look at the 10% (darkest) and 80% (brightest) areas of the red channel:
contours(channels.RedChannel.Macaw, levels=[0.10,0.80])
The colors and the line widths here are using default values, in this case cyan for the 10% contour and red for the 80% contour. If we want to change those, we can do so using the %%opts cell magic, described below.
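To give some intuition for what a contour level means, the sketch below works on a single row of values and finds where the data crosses a given level, which is where a contour line for that level would pass. It assumes the levels are data values on a 0-1 scale, and crossings is a hypothetical helper for illustration; it is not the algorithm used by the contours operation.

```python
def crossings(row, level):
    """Return indices i where the level lies between row[i] and row[i + 1]."""
    return [i for i in range(len(row) - 1)
            if (row[i] < level) != (row[i + 1] < level)]


# One row of made-up intensities, rising to a bright peak and falling again.
row = [0.0, 0.05, 0.2, 0.5, 0.85, 0.9, 0.4, 0.05]

print(crossings(row, 0.10))   # [1, 6]
print(crossings(row, 0.80))   # [3, 5]
```

The low level is crossed near the edges of the bright region and the high level close to its peak, which is why the two contours enclose different areas.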
The %%opts and %opts magics¶
If the line width seems too thick for our purposes, we can change it using the %%opts (options) magic to pass hints to the plotting system:
%%opts Contours (linewidth=1.3)
rcontours = contours(channels.RedChannel.Macaw, levels=[0.10,0.80])
rcontours
The plotting system is entirely separate from all the HoloViews components discussed above and in the Elements tutorial, and is accessed via this special mechanism to ensure that plotting-specific hints that are not semantic properties of your data do not get mixed up with the data itself. Here, the first token supplied to %%opts is Contours. This token specifies the HoloViews type that we want to style differently; we could have used Image if we wanted to change something about the underlying image instead.
Next we specify the type of option we want to change, using style(keyword=value,...) (or more compactly (keyword=value ...), as here) to indicate that we are supplying plotting styles, i.e. keywords to be passed directly to the underlying plotting system. The keywords supported here are simply those provided by matplotlib for the current implementation, but different keywords would be supported if the underlying backend changed (since HoloViews does not store any list of all the keywords that might be supported by a plotting library).
In addition to the style options, we can pass other plot options using plot(keyword=value,...) (or simply [keyword=value...] for short, with square brackets). These options are given as parameters to the objects in the plotting code of HoloViews itself, not to the underlying plotting library. For instance, plot titles and sizes can be changed at this level. The parameters supported for any given component are listed in the HoloViews Reference Manual.
A third type of options can be passed using norm(+axiswise|+framewise) (or simply {+axiswise|+framewise} for short, with curly brackets). These options are also passed to the HoloViews plotting system, but are a separate set of options that control normalization, enabling (with +) or disabling (with -) normalization within a Layout or between components of a given type. Normalization is a major feature of HoloViews, with sizes, ranges, and values all being coordinated across Elements in a Layout, and these options allow this behavior to be controlled precisely to make sure that the important aspects of your data are visible.
Note that tab-completion is available wherever possible when specifying the magic. Also note that options specified with %%opts apply only to objects defined in the specific IPython Notebook cell for which they are supplied, but you can replace the cell magic %%opts with the corresponding line magic %opts (with a single %) to apply to all subsequent cells of your notebook as well.
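The three shorthand bracket styles can be summarized with a small sketch that splits an options string into its target and its style, plot, and norm groups. split_opts is a hypothetical helper written only to illustrate the syntax described above; it is not the parser HoloViews uses, it handles only the shorthand forms, and it does not cope with values that themselves contain brackets.

```python
import re

# One alternative per bracket style described above; flat groups only, so a
# value that itself contains brackets (e.g. a function call) is not handled.
GROUP = re.compile(
    r"\((?P<style>[^()]*)\)|\[(?P<plot>[^\[\]]*)\]|\{(?P<norm>[^{}]*)\}")


def split_opts(spec):
    """Split 'Target (style) [plot] {norm}' into the target and its groups."""
    target = spec.split(None, 1)[0]
    groups = {"style": None, "plot": None, "norm": None}
    for match in GROUP.finditer(spec):
        for kind, text in match.groupdict().items():
            if text is not None:
                groups[kind] = text.strip()
    return target, groups


target, groups = split_opts("Contours (linewidth=1.3) {+axiswise}")
print(target)   # Contours
print(groups)   # {'style': 'linewidth=1.3', 'plot': None, 'norm': '+axiswise'}
```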
Animations and slider bars¶
The final topic for the introduction is animations. Animation relies on a powerful multidimensional data container called a HoloMap, which is described in detail in the Exploring Data tutorial. Here, as a brief illustration, we show how to construct three HoloMaps from sets of Images with Contours, constructed using a list of different threshold levels for the above image.
As the plot above suggests, a large number of threshold levels would be very difficult to include in a single plot. In such a case, one could lay them all out side by side, but here we show how to combine them into three HoloMap objects that support animation:
%%opts Contours.Red (color=Palette('Reds')) Contours.Green (color=Palette('Greens')) Contours.Blue (color=Palette('Blues'))
from holoviews import Layout, HoloMap
import numpy
data = {lvl: (contours(channels.RedChannel.Macaw, levels=[lvl], group='Red')
              + contours(channels.Channel.Green, levels=[lvl], group='Green')
              + contours(channels.Channel.Blue, levels=[lvl], group='Blue'))
        for lvl in numpy.linspace(0.1, 0.9, 9)}
levels = Layout.collate(data, key_dimensions=['Levels'])
levels
The levels object here is a Layout, just as in the other examples above, but it is displayed as an animation because it happens to contain three HoloMaps that have an additional dimension Levels beyond what has been laid out spatially in each image. There's no other special implementation necessary to get animations; they appear automatically whenever there are these additional dimensions in a HoloMap that haven't been sliced, sampled, or reduced down enough to fit into a single plot. Your data, as always, remains available within the object, if you later want to pull out portions of it to display without an animation:
green05 = levels.Overlay.Green[0.5]
green05 + green05.Channel + green05.Channel.Green.sample(y=0.0)
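The collation step used earlier, which turns a dictionary of Layouts keyed by level into a Layout of three HoloMaps, can be pictured as a kind of transpose. The plain-Python sketch below shows that idea with tuples and dictionaries standing in for Layouts and HoloMaps; collate_sketch is a hypothetical function for illustration and is not the implementation of Layout.collate.

```python
def collate_sketch(frames):
    """Turn {key: (a, b, c)} into ({key: a}, {key: b}, {key: c})."""
    width = len(next(iter(frames.values())))
    columns = tuple({} for _ in range(width))
    for key, row in frames.items():
        for column, item in zip(columns, row):
            column[key] = item
    return columns


# Strings stand in for the red, green, and blue contour overlays at each level.
frames = {level: ("red@%.1f" % level, "green@%.1f" % level, "blue@%.1f" % level)
          for level in (0.1, 0.5, 0.9)}
red_map, green_map, blue_map = collate_sketch(frames)

print(sorted(green_map))   # [0.1, 0.5, 0.9]
print(green_map[0.5])      # green@0.5
```

Each resulting dictionary keeps the level as its key, which is the extra dimension that a display system can step through as an animation or slider.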
Now that you understand the basic concepts of HoloViews, it's worth checking out the full features of the HoloMap component, as well as all the other types of elements and containers. Have fun!